feat: optimization technique related validations. #4921

cj-zhang · 2024-11-08T02:19:10Z · edited by gwang111

Description of changes:

Enable quantization and compilation in the same optimization job via ModelBuilder, and add validations that block compilation jobs using TRTLLM and Llama-3.1. (Revisions needed here; these validations are outdated: https://github.com/aws/sagemaker-python-sdk/pull/4875/commits)
Require EULA acceptance when using a gated first-party (1p) draft model via ModelBuilder (the optimize() and deploy() wrappers).
Require EULA acceptance when using a gated first-party (1p) draft model via the JumpStartModel constructor, set_deployment_config(), and ModelBuilder.set_deployment_config().
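
For illustration only, the compilation block described in the first item could look roughly like the sketch below. The function name, the marker strings, and the substring matching are all assumptions made for this example and are not taken from the SDK; the description itself notes that these validations are outdated, so treat the rules as placeholders.

```python
from typing import Iterable

# Hypothetical markers: compilation is rejected when either appears in the
# container type or the model ID. These strings are illustrative assumptions.
BLOCKED_COMPILATION_MARKERS = ("trtllm", "llama-3-1")


def validate_optimization_techniques(
    techniques: Iterable[str], container_type: str = "", model_id: str = ""
) -> None:
    """Raise ValueError if compilation is requested for a blocked target.

    Quantization and compilation may be requested together; only the
    compilation part is checked against the blocked markers.
    """
    requested = {technique.lower() for technique in techniques}
    if "compilation" not in requested:
        return
    target = f"{container_type} {model_id}".lower()
    for marker in BLOCKED_COMPILATION_MARKERS:
        if marker in target:
            raise ValueError(
                f"Compilation is not supported for targets matching '{marker}'."
            )
```

With these placeholder rules, requesting quantization and compilation together passes for an unblocked container, while any request that includes compilation for a TRTLLM container raises a ValueError.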

Testing done:

Calling ModelBuilder.optimize() with speculative_decoding_config set to use a gated 1p draft model triggers the validation when AcceptEula is false, and sets the data source's ModelAccessConfig to True when the EULA is accepted.
Called ModelBuilder.set_deployment_config(config_name, instance_type, accept_draft_model_eula), JumpStart(config_name, instance_type, accept_draft_model_eula), and JumpStart.set_deployment_config(config_name, instance_type, accept_draft_model_eula) to verify all functionality.
Unit tests are not working locally; monitoring them via the checks in this PR.

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

I have read the CONTRIBUTING doc
I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
I used the commit message format described in CONTRIBUTING
I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

I have added tests that prove my fix is effective or that my feature works (if appropriate)
I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
I have checked that my tests are not configured for a specific region or account (if appropriate)
I have used unique_name_from_base to create resource names in integ tests (if appropriate)
If adding any dependency in requirements.txt files, I have spell checked and ensured they exist in PyPi

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Lokiiiiii

It is not clear to me how users can provide a JS model as a draft model in ModelBuilder.optimize. Can you please provide an example, and explain what happens in the case of a gated draft model?

Lokiiiiii · 2024-11-08T15:20:51Z

src/sagemaker/serve/utils/optimize_utils.py

+    return additional_model_data_source.get("S3DataSource").get("S3Uri", None)
+
+
+def _extract_deployment_config_additional_model_data_source_s3_uri(


Duplicate of _extract_additional_model_data_source_s3_uri?

Deployment config uses Pascal case while the PySDK model will use snake case.

Ack.

ToDo: We need to find a different way of closing these differences.
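
One possible way to close the casing gap, sketched under assumptions: a single helper that accepts either key style. The function name and the snake-case key names (s3_data_source, s3_uri) are assumptions for illustration; only the Pascal-case keys appear in the diff above.

```python
from typing import Any, Mapping, Optional


def extract_s3_uri(data_source: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Return the S3 URI from a data source in Pascal case or snake case.

    Returns None when the data source, the S3 section, or the URI is missing,
    instead of raising on a chained .get() call.
    """
    if not data_source:
        return None
    # Deployment configs use "S3DataSource"/"S3Uri"; the snake-case names
    # below are assumed for SDK-side dictionaries.
    s3_source = (
        data_source.get("S3DataSource") or data_source.get("s3_data_source") or {}
    )
    return s3_source.get("S3Uri") or s3_source.get("s3_uri")
```

A single lookup like this would avoid keeping two near-identical extractors in sync, at the cost of being more permissive about which shape it accepts.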

src/sagemaker/serve/utils/optimize_utils.py

Lokiiiiii · 2024-11-08T15:31:02Z

src/sagemaker/serve/utils/optimize_utils.py

+
+            if not accept_eula:
+                raise ValueError(
+                    "Gated draft model requires accepting end-user license agreement (EULA)."


Add a note on how the user can do this, e.g., set the parameter to true.

We can reuse the messaging from jumpstart.factory.model.
Consider reusing it through a helper function.
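
A minimal sketch of such a shared helper, assuming the message only needs the model ID and the name of the parameter to set. Both function names and the exact wording are hypothetical; the actual messaging in jumpstart.factory.model is not reproduced here.

```python
def draft_model_eula_message(
    model_id: str, parameter_name: str = "accept_draft_model_eula"
) -> str:
    """Build an actionable error message for a gated draft model."""
    return (
        f"Draft model '{model_id}' is gated and requires accepting its end-user "
        f"license agreement (EULA). Review the license terms, then set "
        f"{parameter_name}=True to accept them."
    )


def require_draft_model_eula(model_id: str, is_gated: bool, accept_eula: bool) -> None:
    """Raise ValueError when a gated draft model is used without EULA acceptance."""
    if is_gated and not accept_eula:
        raise ValueError(draft_model_eula_message(model_id))
```

Keeping the message in one function means the optimize() and deployment-config paths would report the same guidance.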

src/sagemaker/serve/utils/optimize_utils.py

src/sagemaker/jumpstart/factory/model.py

src/sagemaker/jumpstart/model.py

src/sagemaker/jumpstart/types.py

Lokiiiiii

The 2 most recent commits look good! I would like to see more tests once the local issues are fixed.

Lokiiiiii · 2024-11-11T16:56:13Z

What is the behavior when a customer provides a SpeculativeDecodingConfig with "SageMaker" as the model provider for a JS llama-3.1-70B model?

cj-zhang · 2024-11-11T23:52:46Z

What is the behavior when a customer provides a SpeculativeDecodingConfig with "SageMaker" as the model provider for a JS llama-3.1-70B model?

Commit d10c475 introduces a ValueError that is raised when a proprietary SM draft model can't be found, and recommends that the customer try Auto instead.
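
A rough sketch of the behavior described in this reply, not the code from the commit. The function name, the boolean availability flags, and the order in which Auto picks a provider (SageMaker first, then JumpStart) are assumptions made for illustration.

```python
def resolve_draft_model_provider(
    requested: str, has_sagemaker_draft: bool, has_jumpstart_draft: bool
) -> str:
    """Resolve a requested ModelProvider to a concrete draft model provider."""
    # Insertion order doubles as the assumed preference order for "Auto".
    available = {"sagemaker": has_sagemaker_draft, "jumpstart": has_jumpstart_draft}
    provider = requested.strip().lower()
    if provider == "auto":
        for candidate, present in available.items():
            if present:
                return candidate
        raise ValueError("No draft model is available for this model.")
    if provider not in available:
        raise ValueError(f"Unsupported ModelProvider: {requested!r}.")
    if not available[provider]:
        # Mirrors the reply above: point the customer at "Auto".
        raise ValueError(
            f"No {requested} draft model was found for this model. "
            "Try setting ModelProvider to 'Auto'."
        )
    return provider
```

Under these assumptions, a request for "SageMaker" on a model that only has a JumpStart draft model fails with a pointer to Auto, while "Auto" resolves to the JumpStart draft model.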

JGuinegagne

Please add unit test coverage for jumpstart construct changes in the PySDK.

src/sagemaker/jumpstart/model.py

src/sagemaker/jumpstart/types.py

src/sagemaker/jumpstart/utils.py

src/sagemaker/serve/builder/jumpstart_builder.py

src/sagemaker/serve/utils/optimize_utils.py

src/sagemaker/jumpstart/utils.py

src/sagemaker/jumpstart/model.py

src/sagemaker/jumpstart/utils.py

tests/unit/sagemaker/jumpstart/test_utils.py

Joseph Zhang added 2 commits November 7, 2024 11:37

Enable quantization and compilation in the same optimization job via …

7ec16e6

…ModelBuilder and add validations to block compilation jobs using TRTLLM an Llama-3.1.

Require EULA acceptance when using a gated 1p draft model via ModelBu…

cf70f59

…ilder.

cj-zhang requested a review from a team as a code owner November 8, 2024 02:19

cj-zhang requested a review from nileshvd November 8, 2024 02:19

cj-zhang had a problem deploying to manual-approval November 8, 2024 02:19 — with GitHub Actions Error

add accept_draft_model_eula to JumpStartModel when deployment config …

fcb5092

…with gated draft model is selected

gwang111 had a problem deploying to manual-approval November 8, 2024 04:08 — with GitHub Actions Error

gwang111 requested review from Lokiiiiii and removed request for nileshvd November 8, 2024 04:13

add map of valid optimization combinations

9489b8d

gwang111 had a problem deploying to manual-approval November 8, 2024 04:31 — with GitHub Actions Error

Lokiiiiii suggested changes Nov 8, 2024

whittech1 reviewed Nov 8, 2024

src/sagemaker/jumpstart/types.py (outdated)

Add ModelBuilder support for JumpStart-provided draft models.

5512c26

cj-zhang had a problem deploying to manual-approval November 9, 2024 00:31 — with GitHub Actions Error

Tweak draft model EULA validations and messaging. Remove redundant de…

c94a78b

…ployment_config flow validation in optimize_utils in favor of the one directly on jumpstart/factory/model.

cj-zhang had a problem deploying to manual-approval November 9, 2024 01:04 — with GitHub Actions Error

Lokiiiiii reviewed Nov 11, 2024

Add "Auto" speculative decoding ModelProvider option; add validations…

d10c475

… to differentiate SageMaker/JumpStart draft models.

cj-zhang had a problem deploying to manual-approval November 11, 2024 23:50 — with GitHub Actions Error

Fix JumpStartModel.AdditionalModelDataSource model access config assi…

8fb27a0

…gnment.

cj-zhang had a problem deploying to manual-approval November 12, 2024 01:21 — with GitHub Actions Error

move the accept eula configurations into deploy flow

779f6d6

gwang111 had a problem deploying to manual-approval November 12, 2024 01:23 — with GitHub Actions Error

Merge branch 'master' into QuicksilverV2

aef3a90

gwang111 had a problem deploying to manual-approval November 12, 2024 01:24 — with GitHub Actions Error

gwang111 requested a review from JGuinegagne November 12, 2024 04:38

move the accept eula configurations into deploy flow

b7b15b8

JGuinegagne changed the title ~~Add optimization technique related validations.~~ feat: optimization technique related validations. Nov 13, 2024

JGuinegagne requested changes Nov 14, 2024

src/sagemaker/jumpstart/utils.py (outdated)

fix naming and messaging

8f0083b

gwang111 had a problem deploying to manual-approval November 14, 2024 01:47 — with GitHub Actions Error

ModelBuilder speculative decoding UTs and minor fixes.

8b73f34

cj-zhang had a problem deploying to manual-approval November 14, 2024 02:16 — with GitHub Actions Error

Merge branch 'master' into QuicksilverV2

c06aef0

gwang111 had a problem deploying to manual-approval November 14, 2024 06:40 — with GitHub Actions Error

Fix set union.

09a54dc

cj-zhang had a problem deploying to manual-approval November 14, 2024 19:15 — with GitHub Actions Error

add UTs for JumpStart deployment

3b147cd

gwang111 had a problem deploying to manual-approval November 15, 2024 00:49 — with GitHub Actions Error

fix formatting issues

65cb5b3

gwang111 had a problem deploying to manual-approval November 15, 2024 01:04 — with GitHub Actions Error

gwang111 removed the request for review from AWS-pratab November 15, 2024 01:04

address validation comments

4d1e12b

gwang111 had a problem deploying to manual-approval November 15, 2024 18:19 — with GitHub Actions Error

fix doc strings

bf706ad

gwang111 had a problem deploying to manual-approval November 15, 2024 18:49 — with GitHub Actions Error

Add TRTLLM compilation + speculative decoding validation.

f121eb0

cj-zhang had a problem deploying to manual-approval November 15, 2024 18:58 — with GitHub Actions Error

Lokiiiiii approved these changes Nov 15, 2024

zhaoqizqwang previously approved these changes Nov 15, 2024

JGuinegagne previously approved these changes Nov 15, 2024

src/sagemaker/jumpstart/model.py (outdated)

src/sagemaker/jumpstart/model.py (outdated)

src/sagemaker/jumpstart/utils.py (outdated)

tests/unit/sagemaker/jumpstart/test_utils.py (outdated)

address nits

9148e70

gwang111 dismissed stale reviews from JGuinegagne and zhaoqizqwang via 9148e70 November 15, 2024 23:37

gwang111 deployed to manual-approval November 15, 2024 23:37 — with GitHub Actions Active

zhaoqizqwang approved these changes Nov 16, 2024
